Results 1 - 20 of 50
1.
J Exp Med ; 221(6), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38597954

ABSTRACT

Early stages of deadly respiratory diseases including COVID-19 are challenging to elucidate in humans. Here, we define cellular tropism and transcriptomic effects of SARS-CoV-2 virus by productively infecting healthy human lung tissue and using scRNA-seq to reconstruct the transcriptional program in "infection pseudotime" for individual lung cell types. SARS-CoV-2 predominantly infected activated interstitial macrophages (IMs), which can accumulate thousands of viral RNA molecules, taking over 60% of the cell transcriptome and forming dense viral RNA bodies while inducing host profibrotic (TGFB1, SPP1) and inflammatory (early interferon response, CCL2/7/8/13, CXCL10, and IL6/10) programs and destroying host cell architecture. Infected alveolar macrophages (AMs) showed none of these extreme responses. Spike-dependent viral entry into AMs used ACE2 and Sialoadhesin/CD169, whereas IM entry used DC-SIGN/CD209. These results identify activated IMs as a prominent site of viral takeover, the focus of inflammation and fibrosis, and suggest targeting CD209 to prevent early pathology in COVID-19 pneumonia. This approach can be generalized to any human lung infection and to evaluate therapeutics.


Subject(s)
COVID-19 , Humans , SARS-CoV-2 , Macrophages , Inflammation , Viral RNA , Lung
2.
Proc Natl Acad Sci U S A ; 121(15): e2304671121, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38564640

ABSTRACT

Contingency tables, data represented as counts matrices, are ubiquitous across quantitative research and data-science applications. Existing statistical tests are insufficient, however, as none are simultaneously computationally efficient and statistically valid for a finite number of observations. In this work, motivated by a recent application in reference-free genomic inference [K. Chaung et al., Cell 186, 5440-5456 (2023)], we develop Optimized Adaptive Statistic for Inferring Structure (OASIS), a family of statistical tests for contingency tables. OASIS constructs a test statistic which is linear in the normalized data matrix, providing closed-form P-value bounds through classical concentration inequalities. In the process, OASIS provides a decomposition of the table, lending interpretability to its rejection of the null. We derive the asymptotic distribution of the OASIS test statistic, showing that these finite-sample bounds correctly characterize the test statistic's P-value up to a variance term. Experiments on genomic sequencing data highlight the power and interpretability of OASIS. Using OASIS, we develop a method that can detect SARS-CoV-2 and Mycobacterium tuberculosis strains de novo, which existing approaches cannot achieve. We demonstrate in simulations that OASIS is robust to overdispersion, a common feature in genomic data like single-cell RNA sequencing, where under accepted noise models OASIS provides good control of the false discovery rate, while Pearson's χ² test consistently rejects the null. Additionally, we show in simulations that OASIS is more powerful than Pearson's χ² test in certain regimes, including for some important two-group alternatives, which we corroborate with approximate power calculations.


Subject(s)
Genome , Genomics , Chromosome Mapping
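
The abstract of entry 2 uses Pearson's χ² test as the classical baseline for contingency tables. As a point of reference only, the statistic for that baseline can be sketched in a few lines of plain Python. This is not OASIS itself, whose statistic and P-value bounds are not specified in the abstract; the function name `pearson_chi2` and the example table are hypothetical.

```python
from typing import List, Tuple


def pearson_chi2(table: List[List[float]]) -> Tuple[float, int]:
    """Return (chi-squared statistic, degrees of freedom) for a counts matrix.

    Assumes the table has no all-zero rows or columns; otherwise the
    degrees of freedom reported here would be too large.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table has no observations")

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


if __name__ == "__main__":
    # Hypothetical 2x2 table: rows are sequence variants, columns are samples.
    stat, dof = pearson_chi2([[10, 20], [20, 10]])
    print(f"chi2 = {stat:.4f}, dof = {dof}")
```

Converting the statistic to a P-value relies on the asymptotic χ² distribution, which is the finite-sample limitation the abstract points to.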
3.
bioRxiv ; 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-36993432

ABSTRACT

SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition, subsuming many application-specific methods. Here, we introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient k-mer counting approach. SPLASH2 enables rapid analysis of massive datasets from a wide range of sequencing technologies and biological contexts, delivering unparalleled scale and speed. The SPLASH2 algorithm unveils new biology (without tuning) in single-cell RNA-sequencing data from human muscle cells, as well as bulk RNA-seq from the entire Cancer Cell Line Encyclopedia (CCLE), including substantial unannotated alternative splicing in the cancer transcriptome. The same untuned SPLASH2 algorithm recovers the BCR-ABL gene fusion, and detects circRNA sensitively and specifically, underscoring SPLASH2's unmatched precision and scalability across diverse RNA-seq detection tasks.
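
For readers unfamiliar with k-mer counting, the basic operation mentioned in the abstract of entry 3 can be illustrated with a naive sketch. This is not the SPLASH2 implementation, which the abstract describes only as "efficient"; the function name `count_kmers`, the choice to skip k-mers containing `N`, and the example reads are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every length-k substring (k-mer) across a collection of reads."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts: Counter = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k along the read, one base at a time.
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            # Assumption: k-mers with ambiguous bases (N) are ignored.
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical reads; real input would come from FASTQ files.
    counts = count_kmers(["ACGTAC", "acgn"], k=3)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
```

Counting k-mers per sample in this way yields a counts matrix (k-mers by samples), which is the kind of table a statistical test of sample-specific variation would operate on.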

4.
Cell ; 186(25): 5440-5456.e26, 2023 12 07.
Article in English | MEDLINE | ID: mdl-38065078

ABSTRACT

Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), which directly analyzes raw sequencing data, using a statistical test to detect a signature of regulation: sample-specific sequence variation. SPLASH detects many types of variation and can be efficiently run at scale. We show that SPLASH identifies complex mutation patterns in SARS-CoV-2, discovers regulated RNA isoforms at the single-cell level, detects the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a unifying approach to genomic analysis that enables expansive discovery without metadata or references.


Subject(s)
Algorithms , Genomics , Genome , RNA Sequence Analysis , Humans , HLA Antigens/genetics , Single-Cell Analysis
5.
bioRxiv ; 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37961606

ABSTRACT

Contingency tables, data represented as counts matrices, are ubiquitous across quantitative research and data-science applications. Existing statistical tests are insufficient, however, as none are simultaneously computationally efficient and statistically valid for a finite number of observations. In this work, motivated by a recent application in reference-free genomic inference (1), we develop OASIS (Optimized Adaptive Statistic for Inferring Structure), a family of statistical tests for contingency tables. OASIS constructs a test statistic which is linear in the normalized data matrix, providing closed-form p-value bounds through classical concentration inequalities. In the process, OASIS provides a decomposition of the table, lending interpretability to its rejection of the null. We derive the asymptotic distribution of the OASIS test statistic, showing that these finite-sample bounds correctly characterize the test statistic's p-value up to a variance term. Experiments on genomic sequencing data highlight the power and interpretability of OASIS. The same method based on OASIS significance calls detects SARS-CoV-2 and Mycobacterium tuberculosis strains de novo, which cannot be achieved with current approaches. We demonstrate in simulations that OASIS is robust to overdispersion, a common feature in genomic data like single-cell RNA sequencing, where under accepted noise models OASIS still provides good control of the false discovery rate, while Pearson's χ² test consistently rejects the null. Additionally, we show on synthetic data that OASIS is more powerful than Pearson's χ² test in certain regimes, including for some important two-group alternatives, which we corroborate with approximate power calculations.

6.
Genome Biol ; 24(1): 240, 2023 10 20.
Article in English | MEDLINE | ID: mdl-37864197

ABSTRACT

Diversity-generating and mobile genetic elements are key to microbial and viral evolution and can result in evolutionary leaps. State-of-the-art algorithms to detect these elements have limitations. Here, we introduce DIVE, a new reference-free approach to overcome these limitations using information contained in sequencing reads alone. We show that DIVE has improved detection power compared to existing reference-based methods using simulations and real data. We use DIVE to rediscover and characterize the activity of known and novel elements and generate new biological hypotheses about the mobilome. Building on DIVE, we develop a reference-free framework capable of de novo discovery of mobile genetic elements.


Subject(s)
Horizontal Gene Transfer , Interspersed Repetitive Sequences , DNA Transposable Elements
7.
bioRxiv ; 2023 Jul 31.
Article in English | MEDLINE | ID: mdl-37503014

ABSTRACT

The authors have withdrawn this manuscript due to a duplicate posting of manuscript number BIORXIV/2022/497555. Therefore, the authors do not wish this work to be cited as reference for the project. If you have any questions, please contact the corresponding author. The correct preprint can be found at doi: https://doi.org/10.1101/2022.06.24.497555.

8.
Nat Methods ; 20(8): 1159-1169, 2023 08.
Article in English | MEDLINE | ID: mdl-37443337

ABSTRACT

The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed using computational tools. Numerous such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools detected more than 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were validated using three orthogonal methods. Generally, tool-specific precision is high and similar (median of 98.8%, 96.3% and 95.5% for qPCR, RNase R and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant differentiators. Of note, precision values are lower when evaluating low-abundance circRNAs. We also show that the tools can be used complementarily to increase detection sensitivity. Finally, we offer recommendations for future circRNA detection and validation.


Subject(s)
Benchmarking , Circular RNA , Humans , Circular RNA/genetics , RNA/genetics , RNA/metabolism , RNA Sequence Analysis/methods
9.
bioRxiv ; 2023 Mar 14.
Article in English | MEDLINE | ID: mdl-36993757

ABSTRACT

Technical advances have led to an explosion in the amount of biological data available in recent years, especially in the field of RNA sequencing. Specifically, spatial transcriptomics (ST) datasets, which allow each RNA molecule to be mapped to the 2D location it originated from within a tissue, have become readily available. Due to computational challenges, ST data has rarely been used to study RNA processing such as splicing or differential UTR usage. We apply the ReadZS and the SpliZ, methods developed to analyze RNA processing in scRNA-seq data, to analyze spatial localization of RNA processing directly from ST data for the first time. Using Moran's I metric for spatial autocorrelation, we identify genes with spatially regulated RNA processing in the mouse brain and kidney, re-discovering known spatial regulation in Myl6 and identifying previously unknown spatial regulation in genes such as Rps24, Gng13, Slc8a1, Gpm6a, Gpx3, ActB, Rps8, and S100A9. The rich set of discoveries made here from commonly used reference datasets provides a small taste of what can be learned by applying this technique more broadly to the large quantity of Visium data currently being created.
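
Moran's I, the spatial autocorrelation metric named in the abstract of entry 9, has a standard closed form: I = (N / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where W is the sum of the weights. A minimal sketch follows. It assumes the values would be one per-spot score (for example a SpliZ or ReadZS score for a single gene) and that the weight matrix encodes which spots are neighbors; how the authors actually built their weights is not stated in the abstract, so the matrix below is hypothetical.

```python
from typing import Sequence


def morans_i(values: Sequence[float],
             weights: Sequence[Sequence[float]]) -> float:
    """Moran's I for values at n locations given an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Total variation of the values around their mean.
    denom = sum(d * d for d in dev)
    # Sum of all off-diagonal weights (self-pairs are excluded).
    w_total = sum(weights[i][j] for i in range(n) for j in range(n) if i != j)
    if denom == 0 or w_total == 0:
        raise ValueError("Moran's I is undefined for constant values or zero weights")

    # Weighted sum of cross-products between neighboring locations.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n) if i != j)
    return (n / w_total) * (num / denom)


if __name__ == "__main__":
    # Hypothetical layout: four spots in a row, each adjacent to its neighbors.
    chain = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 2, 3, 4], chain))  # smooth gradient, positive I
    print(morans_i([1, 0, 1, 0], chain))  # alternating pattern, negative I
```

Values near +1 indicate that neighboring spots carry similar scores, values near 0 indicate no spatial structure, and negative values indicate that neighbors tend to differ.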

11.
bioRxiv ; 2023 Jul 31.
Article in English | MEDLINE | ID: mdl-35794890

ABSTRACT

Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a new unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), an approach that directly analyzes raw sequencing data to detect a signature of regulation: sample-specific sequence variation. The approach, which includes a new statistical test, is computationally efficient and can be run at scale. SPLASH unifies detection of myriad forms of sequence variation. We demonstrate that SPLASH identifies complex mutation patterns in SARS-CoV-2 strains, discovers regulated RNA isoforms at the single cell level, documents the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a new unifying approach to genomic analysis that enables an expansive scope of discovery without metadata or references.

12.
Genome Biol ; 23(1): 226, 2022 10 25.
Article in English | MEDLINE | ID: mdl-36284317

ABSTRACT

RNA processing, including splicing and alternative polyadenylation, is crucial to gene function and regulation, but methods to detect RNA processing from single-cell RNA sequencing data are limited by reliance on pre-existing annotations, peak calling heuristics, and collapsing measurements by cell type. We introduce ReadZS, an annotation-free statistical approach to identify regulated RNA processing in single cells. ReadZS discovers cell type-specific RNA processing in human lung and conserved, developmentally regulated RNA processing in mammalian spermatogenesis, including global 3' UTR shortening in human spermatogenesis. ReadZS also discovers global 3' UTR lengthening in Arabidopsis development, highlighting the usefulness of this method in under-annotated transcriptomes.


Subject(s)
Polyadenylation , Transcriptome , Animals , Humans , 3' Untranslated Regions , RNA-Seq , RNA Sequence Analysis/methods , Mammals/genetics
13.
Nucleic Acids Res ; 50(21): 12400-12424, 2022 11 28.
Article in English | MEDLINE | ID: mdl-35947650

ABSTRACT

Trimethylguanosine synthase 1 (TGS1) is a highly conserved enzyme that converts the 5'-monomethylguanosine cap of small nuclear RNAs (snRNAs) to a trimethylguanosine cap. Here, we show that loss of TGS1 in Caenorhabditis elegans, Drosophila melanogaster and Danio rerio results in neurological phenotypes similar to those caused by survival motor neuron (SMN) deficiency. Importantly, expression of human TGS1 ameliorates the SMN-dependent neurological phenotypes in both flies and worms, revealing that TGS1 can partly counteract the effects of SMN deficiency. TGS1 loss in HeLa cells leads to the accumulation of immature U2 and U4atac snRNAs with long 3' tails that are often uridylated. snRNAs with defective 3' terminations also accumulate in Drosophila Tgs1 mutants. Consistent with defective snRNA maturation, TGS1 and SMN mutant cells also exhibit partially overlapping transcriptome alterations that include aberrantly spliced and readthrough transcripts. Together, these results identify a neuroprotective function for TGS1 and reinforce the view that defective snRNA maturation affects neuronal viability and function.


Subject(s)
Methyltransferases , Motor Neurons , Small Nuclear RNA , Animals , Humans , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Drosophila/genetics , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , HeLa Cells , Motor Neurons/metabolism , Motor Neurons/pathology , Phenotype , Small Nuclear RNA/metabolism , Methyltransferases/metabolism
14.
Science ; 376(6594): eabl4896, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35549404

ABSTRACT

Molecular characterization of cell types using single-cell transcriptome sequencing is revolutionizing cell biology and enabling new insights into the physiology of human organs. We created a human reference atlas comprising nearly 500,000 cells from 24 different tissues and organs, many from the same donor. This atlas enabled molecular characterization of more than 400 cell types, their distribution across tissues, and tissue-specific variation in gene expression. Using multiple tissues from a single donor enabled identification of the clonal distribution of T cells between tissues, identification of the tissue-specific mutation rate in B cells, and analysis of the cell cycle state and proliferative potential of shared cell types across tissues. Cell type-specific RNA splicing was discovered and analyzed across tissues within an individual.


Subject(s)
Atlases as Topic , Cells , Organ Specificity , RNA Splicing , Single-Cell Analysis , Transcriptome , B-Lymphocytes/metabolism , Cells/metabolism , Humans , Organ Specificity/genetics , T-Lymphocytes/metabolism
16.
Nat Methods ; 19(3): 307-310, 2022 03.
Article in English | MEDLINE | ID: mdl-35241832

ABSTRACT

Detecting single-cell-regulated splicing from droplet-based technologies is challenging. Here, we introduce the splicing Z score (SpliZ), an annotation-free statistical method to detect regulated splicing in single-cell RNA sequencing. We applied the SpliZ to human lung cells, discovering hundreds of genes with cell-type-specific splicing patterns including ones with potential implications for basic and translational biology.


Subject(s)
Alternative Splicing , RNA Splicing , Humans
17.
Elife ; 10, 2021 09 13.
Article in English | MEDLINE | ID: mdl-34515025

ABSTRACT

The extent to which splicing is regulated at single-cell resolution has remained controversial due to both available data and methods to interpret it. We apply the SpliZ, a new statistical approach, to detect cell-type-specific splicing in >110K cells from 12 human tissues. Using 10X Chromium data for discovery, 9.1% of genes with computable SpliZ scores are cell-type-specifically spliced, including ubiquitously expressed genes MYL6 and RPS24. These results are validated with RNA FISH, single-cell PCR, and Smart-seq2. SpliZ analysis reveals 170 genes with regulated splicing during human spermatogenesis, including examples conserved in mouse and mouse lemur. The SpliZ allows model-based identification of subpopulations indistinguishable based on gene expression, illustrated by subpopulation-specific splicing of classical monocytes involving an ultraconserved exon in SAT1. Together, this analysis of differential splicing across multiple organs establishes that splicing is regulated cell-type-specifically.


Subject(s)
Cheirogaleidae/genetics , Mice/genetics , RNA Splicing , Single-Cell Analysis , Animals
18.
Genome Biol ; 22(1): 219, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34353340

ABSTRACT

Precise splice junction calls are currently unavailable in scRNA-seq pipelines such as the 10x Chromium platform but are critical for understanding single-cell biology. Here, we introduce SICILIAN, a new method that assigns statistical confidence to splice junctions from a spliced aligner to improve precision. SICILIAN is a general method that can be applied to bulk or single-cell data, but has particular utility for single-cell analysis due to that data's unique challenges and opportunities for discovery. SICILIAN's precise splice detection achieves high accuracy on simulated data, improves concordance between matched single-cell and bulk datasets, and increases agreement between biological replicates. SICILIAN detects unannotated splicing in single cells, enabling the discovery of novel splicing regulation through single-cell analysis workflows.


Subject(s)
RNA Splicing , Single-Cell Analysis , Algorithms , Alternative Splicing , Animals , Computational Biology/methods , Entropy , Humans , Mice , RNA Sequence Analysis/methods
19.
medRxiv ; 2020 Sep 01.
Article in English | MEDLINE | ID: mdl-32766602

ABSTRACT

During COVID19 and other viral pandemics, rapid generation of host and pathogen genomic data is critical to tracking infection and informing therapies. There is an urgent need for efficient approaches to this data generation at scale. We have developed a scalable, high throughput approach to generate high fidelity low pass whole genome and HLA sequencing, viral genomes, and representation of human transcriptome from single nasopharyngeal swabs of COVID19 patients.

20.
PLoS Comput Biol ; 15(12): e1007537, 2019 12.
Article in English | MEDLINE | ID: mdl-31830035

ABSTRACT

Next-generation sequencing is a cutting-edge technology, but quantifying a dynamic range of abundances for different RNA or DNA species requires increasing sampling depth to levels that can be prohibitively expensive due to physical limits on molecular throughput of sequencers. To overcome this problem, we introduce a new general sampling theory which uses biophysical principles to functionally encode the abundance of a species before sampling, SeQUential depletIon and enriCHment (SQUICH). In theory and simulation, SQUICH enables sampling at a logarithmic rate to achieve the same precision as attained with conventional sequencing. A simple proof-of-principle experimental implementation of SQUICH in a controlled complex system of ~262,000 oligonucleotides already reduces sequencing depth by a factor of 10. SQUICH lays the groundwork for a general solution to a fundamental problem in molecular sampling and enables a new generation of efficient, precise molecular measurement at logarithmic or better sampling depth.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Base Sequence , Computational Biology , Computer Simulation , DNA/genetics , High-Throughput Nucleotide Sequencing/statistics & numerical data , Proof of Concept Study , RNA/genetics , Sampling Studies , DNA Sequence Analysis/methods , DNA Sequence Analysis/statistics & numerical data , RNA Sequence Analysis/methods , RNA Sequence Analysis/statistics & numerical data , Species Specificity